Movement Pruning: Adaptive Sparsity by Fine-Tuning
Magnitude pruning is a widely used strategy for reducing model size in pure supervised learning; however, it is less effective in the transfer learning regime that has become standard for state-of-the-art natural language processing applications. We propose the use of movement pruning, a simple, deterministic first-order weight pruning method that is more adaptive to pretrained model fine-tuning. We give mathematical foundations to the method and compare it to existing zeroth- and first-order pruning methods. Experiments show that when pruning large pretrained language models, movement pruning shows significant improvements in high-sparsity regimes. When combined with distillation, the approach achieves minimal accuracy loss with down to only 3% of the model parameters.
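The first-order criterion behind movement pruning can be illustrated with a toy sketch (a simplified, hypothetical illustration of the published scoring rule, not the authors' implementation): each weight accumulates a score S_i -= (dL/dW_i) * W_i over fine-tuning steps, so weights that move away from zero gain importance, and only the top-scoring fraction survives.

```python
# Hedged toy sketch of movement pruning scoring. Assumptions: plain Python
# lists stand in for weight tensors, and the gradient values are made up.

def update_scores(scores, weights, grads):
    """Accumulate movement scores: S_i -= (dL/dW_i) * W_i per step."""
    return [s - g * w for s, g, w in zip(scores, grads, weights)]

def topk_mask(scores, keep_fraction):
    """Keep only the weights with the highest accumulated movement scores."""
    k = max(1, int(len(scores) * keep_fraction))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [1.0 if s >= threshold else 0.0 for s in scores]

# Toy fine-tuning trace: 4 weights, 2 gradient steps (invented values).
weights = [0.5, -0.3, 0.1, -0.8]
scores = [0.0, 0.0, 0.0, 0.0]
for grads in ([-0.2, 0.1, -0.4, 0.05], [-0.1, 0.2, -0.3, 0.02]):
    scores = update_scores(scores, weights, grads)

mask = topk_mask(scores, keep_fraction=0.5)
pruned = [w * m for w, m in zip(weights, mask)]
```

Note how the largest-magnitude weight (-0.8) is pruned while the smaller -0.3 survives, because the scores track movement during fine-tuning rather than magnitude; magnitude pruning would make the opposite choice.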
We sincerely thank the reviewers for their valuable feedback, for pointing out weaknesses in our work, and for suggesting presentation improvements.
R1 - Is this distillation only on the training set, or is there data augmentation?
The model is trained solely on the training set. We follow the vanilla setup described in Hinton et al. [2014].
R1 - Can the authors comment on how movement pruning might work for generative tasks?
R2 - As with most work on pruning, it is not yet possible to realize efficiency gains on GPU.
Review for NeurIPS paper: Movement Pruning: Adaptive Sparsity by Fine-Tuning
Additional Feedback:
- For the results in Figure 2, what does the x-axis represent for models with different numbers of parameters? For example, if a MiniBERT model has half as many parameters as BERT-base, then comparing "10% remaining weights" seems a bit unfair. What would the figure look like if the x-axis were instead the number of non-zero parameters?
- You evaluate on one span extraction and two paired sentence classification tasks, but no single-sentence classification tasks. Why not replace one of the sentence-pair tasks with SST-2, for example? I expect the results would be similar, but it would make the experiments a bit more compelling.
Pruning Pre-trained Language Models Without Fine-Tuning
Ting Jiang, Deqing Wang, Fuzhen Zhuang, Ruobing Xie, Feng Xia
To overcome the overparameterization problem in Pre-trained Language Models (PLMs), pruning is widely used as a simple and straightforward compression method that directly removes unimportant weights. Previous first-order methods successfully compress PLMs to extremely high sparsity with little performance drop. These methods, such as movement pruning, use first-order information to prune PLMs while fine-tuning the remaining weights. In this work, we argue that fine-tuning is redundant for first-order pruning, since first-order pruning alone is sufficient to adapt PLMs to downstream tasks without fine-tuning. Under this motivation, we propose Static Model Pruning (SMP), which uses only first-order pruning to adapt PLMs to downstream tasks while achieving the target sparsity level. In addition, we design a new masking function and training objective to further improve SMP. Extensive experiments at various sparsity levels show that SMP yields significant improvements over first-order and zeroth-order methods. Unlike previous first-order methods, SMP is also applicable at low sparsity and outperforms zeroth-order methods. Meanwhile, SMP is more parameter-efficient than other methods because it does not require fine-tuning.
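The setup the SMP abstract describes, pruning frozen pretrained weights instead of fine-tuning them, can be sketched as follows (a hypothetical toy illustration with invented names and values, not the paper's code): the weights never receive gradient updates; only real-valued importance scores are trained, and the forward pass uses the frozen weights gated by a top-k mask derived from those scores.

```python
# Hedged sketch of pruning-without-fine-tuning. Assumptions: a single
# "layer" is a plain list of frozen weights, and `scores` are toy values
# standing in for importances that would be learned via a straight-through
# estimator in practice.

def topk_mask(scores, keep_fraction):
    """Binary mask keeping the top-scoring fraction of weights."""
    k = max(1, int(len(scores) * keep_fraction))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [1.0 if s >= threshold else 0.0 for s in scores]

def forward(x, frozen_w, scores, keep_fraction):
    """y = sum_i x_i * w_i * m_i; only `scores` would be trainable."""
    mask = topk_mask(scores, keep_fraction)
    return sum(xi * wi * mi for xi, wi, mi in zip(x, frozen_w, mask))

frozen_w = [0.4, -0.2, 0.7, 0.05]   # pretrained weights, never updated
scores = [0.9, 0.1, 0.8, 0.2]       # learned importances (toy values)
y = forward([1.0, 1.0, 1.0, 1.0], frozen_w, scores, keep_fraction=0.5)
# keeps indices 0 and 2 (highest scores), so y = 0.4 + 0.7
```

All task adaptation in this sketch comes from choosing which frozen weights survive, which is the sense in which the abstract calls fine-tuning redundant.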
Structured Pattern Pruning Using Regularization
Iterative Magnitude Pruning (IMP) is a network pruning method that repeats the process of removing the weights with the smallest magnitudes and retraining the model. When visualizing the weight matrices of language models pruned by IMP, previous research has shown that a structured pattern emerges: the surviving weights tend to cluster prominently in a select few rows and columns of the matrix. Though the need for further research on exploiting these structured patterns for potential performance gains has previously been noted, it has yet to be thoroughly studied. We propose SPUR (Structured Pattern pruning Using Regularization), a novel pruning mechanism that preemptively induces structured patterns during compression by adding a regularization term to the objective function of IMP. Our results show that SPUR significantly preserves model performance under high-sparsity settings regardless of the language or the task. Our contributions are as follows: (i) We propose SPUR, a network pruning mechanism that improves upon IMP regardless of the language or the task. (ii) We are the first to empirically verify the efficacy of the "structured patterns" observed in previous pruning research. (iii) SPUR is resource-efficient in that it does not require significant additional computation.
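The abstract does not spell out SPUR's regularizer, but a group-lasso-style penalty on rows and columns is one natural way to push surviving weights into a few rows and columns; the following is my own hedged sketch of that idea, not the paper's objective.

```python
# Hypothetical group-style penalty (my formulation, not SPUR's exact term):
# summing the L2 norm of every row and every column of a weight matrix
# encourages entire rows/columns to shrink to zero, inducing the structured
# pattern that IMP's surviving weights have been observed to form.
import math

def row_column_penalty(W):
    """Sum of L2 norms of each row and each column (group-lasso style)."""
    row_norms = [math.sqrt(sum(w * w for w in row)) for row in W]
    col_norms = [math.sqrt(sum(W[i][j] ** 2 for i in range(len(W))))
                 for j in range(len(W[0]))]
    return sum(row_norms) + sum(col_norms)

# Toy 2x2 matrix: the second row is already zero, so it adds no penalty.
penalty = row_column_penalty([[3.0, 4.0], [0.0, 0.0]])
```

In training, this penalty would be scaled and added to the task loss at each IMP round; unlike an elementwise L1 term, zeroing a whole row removes its contribution from both the row and the column sums at once, which is what favors structured rather than scattered sparsity.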